Introduction

Restatement of problem

stuff goes here

Dataset

stuff goes here

Techniques used

stuff goes here

Libraries

## ── Attaching packages ──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2     ✓ purrr   0.3.4
## ✓ tibble  3.0.3     ✓ dplyr   1.0.2
## ✓ tidyr   1.1.2     ✓ stringr 1.4.0
## ✓ readr   1.3.1     ✓ forcats 0.5.0
## ── Conflicts ─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag()    masks stats::lag()
## 
## Attaching package: 'magrittr'
## The following object is masked from 'package:purrr':
## 
##     set_names
## The following object is masked from 'package:tidyr':
## 
##     extract
## 
## Attaching package: 'mice'
## The following object is masked from 'package:stats':
## 
##     filter
## The following objects are masked from 'package:base':
## 
##     cbind, rbind
## Loading required package: colorspace
## Loading required package: grid
## VIM is ready to use.
## Suggestions and bug-reports can be submitted at: https://github.com/statistikat/VIM/issues
## 
## Attaching package: 'VIM'
## The following object is masked from 'package:datasets':
## 
##     sleep
## 
## Attaching package: 'psych'
## The following objects are masked from 'package:ggplot2':
## 
##     %+%, alpha
## Loading required package: lattice
## 
## Attaching package: 'caret'
## The following object is masked from 'package:purrr':
## 
##     lift

Utility functions

Import Data

Note: Question 1 uses the merged data, so step 2 must be performed first.

2. Merge the beer data with the breweries data and print the first 6 and last 6 observations of the merged file.

| Brewery_id | Drink_name | Beer_ID | ABV | IBU | Style | Ounces | Brewery | City | State |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Get Together | 2692 | 0.045 | 50 | American IPA | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Maggie’s Leap | 2691 | 0.049 | 26 | Milk / Sweet Stout | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Wall’s End | 2690 | 0.048 | 19 | English Brown Ale | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Pumpion | 2689 | 0.060 | 38 | Pumpkin Ale | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Stronghold | 2688 | 0.060 | 25 | American Porter | 16 | NorthGate Brewing | Minneapolis | MN |
| 1 | Parapet ESB | 2687 | 0.056 | 47 | Extra Special / Strong Bitter (ESB) | 16 | NorthGate Brewing | Minneapolis | MN |

| | Brewery_id | Drink_name | Beer_ID | ABV | IBU | Style | Ounces | Brewery | City | State |
|---|---|---|---|---|---|---|---|---|---|---|
| 2405 | 556 | Pilsner Ukiah | 98 | 0.055 | NA | German Pilsener | 12 | Ukiah Brewing Company | Ukiah | CA |
| 2406 | 557 | Heinnieweisse Weissebier | 52 | 0.049 | NA | Hefeweizen | 12 | Butternuts Beer and Ale | Garrattsville | NY |
| 2407 | 557 | Snapperhead IPA | 51 | 0.068 | NA | American IPA | 12 | Butternuts Beer and Ale | Garrattsville | NY |
| 2408 | 557 | Moo Thunder Stout | 50 | 0.049 | NA | Milk / Sweet Stout | 12 | Butternuts Beer and Ale | Garrattsville | NY |
| 2409 | 557 | Porkslap Pale Ale | 49 | 0.043 | NA | American Pale Ale (APA) | 12 | Butternuts Beer and Ale | Garrattsville | NY |
| 2410 | 558 | Urban Wilderness Pale Ale | 30 | 0.049 | NA | English Pale Ale | 12 | Sleeping Lady Brewing Company | Anchorage | AK |
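A minimal sketch of the merge step, assuming both inputs share a `Brewery_id` key. The data frames `beers` and `breweries` and the "Example" rows are hypothetical stand-ins, since the report does not show its import code.

```r
# Toy stand-ins for the beer and brewery tables (partly hypothetical values).
beers <- data.frame(
  Brewery_id = c(1, 1, 2),
  Drink_name = c("Get Together", "Pumpion", "Example Lager"),
  ABV        = c(0.045, 0.060, NA),
  stringsAsFactors = FALSE
)
breweries <- data.frame(
  Brewery_id = c(1, 2),
  Brewery    = c("NorthGate Brewing", "Example Brewing"),
  State      = c("MN", "CO"),
  stringsAsFactors = FALSE
)

# Inner join on the shared key: each beer row gains its brewery's columns.
merged <- merge(beers, breweries, by = "Brewery_id")

head(merged, 6)  # first six observations
tail(merged, 6)  # last six observations
```

With an inner join, a beer whose `Brewery_id` has no match in the brewery table would be dropped silently, so comparing `nrow(merged)` with `nrow(beers)` is a cheap sanity check.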

1. How many breweries are in each state?

See Table:

## `summarise()` ungrouping output (override with `.groups` argument)
## [1] "Total Unique Breweries: "
## [1] 558
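One way the per-state count could be computed, sketched in base R on a hypothetical `breweries` data frame (the report's own code used `summarise()`, which is not reproduced here):

```r
# Hypothetical brewery table: one row per brewery.
breweries <- data.frame(
  Brewery_id = c(1, 2, 3, 4),
  State      = c("MN", "CO", "CO", "CA"),
  stringsAsFactors = FALSE
)

# Count distinct brewery IDs within each state.
per_state <- aggregate(Brewery_id ~ State, data = breweries,
                       FUN = function(x) length(unique(x)))
names(per_state)[2] <- "n_breweries"

# Overall number of unique breweries.
total_unique <- length(unique(breweries$Brewery_id))
```

Counting unique IDs rather than rows matters once the brewery table has been merged with the beer table, because each brewery then appears once per beer.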

3a. Plot missing data for reference

## Warning in plot.aggr(res, ...): not enough horizontal space to display
## frequencies

## 
##  Variables sorted by number of missings: 
##    Variable       Count
##         IBU 0.417012448
##         ABV 0.025726141
##       Style 0.002074689
##  Brewery_id 0.000000000
##  Drink_name 0.000000000
##     Beer_ID 0.000000000
##      Ounces 0.000000000
##     Brewery 0.000000000
##        City 0.000000000
##       State 0.000000000
##    Brewery_id     Drink_name           Beer_ID            ABV         
##  Min.   :  1.0   Length:2410        Min.   :   1.0   Min.   :0.00100  
##  1st Qu.: 94.0   Class :character   1st Qu.: 808.2   1st Qu.:0.05000  
##  Median :206.0   Mode  :character   Median :1453.5   Median :0.05600  
##  Mean   :232.7                      Mean   :1431.1   Mean   :0.05977  
##  3rd Qu.:367.0                      3rd Qu.:2075.8   3rd Qu.:0.06700  
##  Max.   :558.0                      Max.   :2692.0   Max.   :0.12800  
##                                                      NA's   :62       
##       IBU            Style               Ounces        Brewery         
##  Min.   :  4.00   Length:2410        Min.   : 8.40   Length:2410       
##  1st Qu.: 21.00   Class :character   1st Qu.:12.00   Class :character  
##  Median : 35.00   Mode  :character   Median :12.00   Mode  :character  
##  Mean   : 42.71                      Mean   :13.59                     
##  3rd Qu.: 64.00                      3rd Qu.:16.00                     
##  Max.   :138.00                      Max.   :32.00                     
##  NA's   :1005                                                          
##      City               State     
##  Length:2410        CO     : 265  
##  Class :character   CA     : 183  
##  Mode  :character   MI     : 162  
##                     IN     : 139  
##                     TX     : 130  
##                     OR     : 125  
##                     (Other):1406
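Despite its label, the `Count` column in the missingness listing holds proportions: 1005 missing IBU values out of 2410 rows is about 0.417. The sketch below reproduces those figures from the `NA's` counts in the summary and then applies the same calculation to a small hypothetical data frame `toy`.

```r
# NA counts and row total taken from the summary output above.
n_rows <- 2410
n_miss <- c(IBU = 1005, ABV = 62)
prop   <- n_miss / n_rows
round(prop, 4)

# Same idea on a data frame: share of NA per column, largest first.
toy <- data.frame(IBU = c(50, NA, NA, 19),
                  ABV = c(0.045, 0.049, NA, 0.048))
miss_share <- sort(colMeans(is.na(toy)), decreasing = TRUE)
```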

3b. Assess missing data when no data exists for IBU OR ABV

  1. Three entries (Special Release, The Crowler™, and Can’d Aid Foundation) are missing ABV, IBU, and Style.
  2. Cedar Creek "Special Release" is ambiguous and missing ABV, IBU, and Style; it is dropped because it does not help answer the question of interest (QOI).
  3. Oskar Blues Brewery "The Crowler" is a type of can, not an actual beer.
  4. Oskar Blues Brewery "Can’d Aid Foundation" is a relief effort that sends water, so it does not belong in the dataset.
  5. Beer ID 2364, Royal Lager (Weston Brewing), has no ABV or IBU.
  6. Beer ID 2322, Fort Pitt Ale (Fort Pitt Brewing Company), has no ABV or IBU.
  7. Beer ID 1750, Birth IPA (Oskar Blues Brewery).
  8. Beer ID 710: no data.
  9. Beer ID 273, AXL Pale Ale (MillKing It Productions): brewery is out of business, no information available.
  10. Beer ID 1095: no data.
  11. Beer ID 963: no data.

3c. Enter missing data by hand when the data is publicly available

- Add Style data for 2527 and 1635 by looking it up by hand.
- Add IBU and ABV data for many of the missing rows by looking them up by hand (online via BeerAdvocate.com or Untappd.com).

## Matching, by = "Beer_ID"
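A sketch of how hand-collected values might be patched in by `Beer_ID`, filling only cells that are still missing. The data frame `patch` and all numbers in it are hypothetical; the report's actual lookup table is not shown.

```r
beers <- data.frame(Beer_ID = c(10, 11, 12),
                    ABV     = c(0.05, NA, NA),
                    IBU     = c(NA, 40, NA))

# Hand-collected values keyed by Beer_ID (hypothetical numbers).
patch <- data.frame(Beer_ID = c(12, 10),
                    ABV     = c(0.061, NA),
                    IBU     = c(55, 30))

# For each beer, the matching row in `patch`, or NA if there is none.
idx <- match(beers$Beer_ID, patch$Beer_ID)

for (col in c("ABV", "IBU")) {
  # Only overwrite cells that are missing and have a patch row.
  fill <- is.na(beers[[col]]) & !is.na(idx)
  beers[[col]][fill] <- patch[[col]][idx[fill]]
}
```

Restricting the update to missing cells keeps observed values from being overwritten by a hand-entry mistake.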

3d. Assess and impute missing IBU data with median IBU by style

| Brewery_id | ABV | IBU | Drink_name | Style | Ounces | Brewery | City | State |
|---|---|---|---|---|---|---|---|---|
| Min. : 1.0 | Min. :0.00100 | Min. : 3.57 | Length:2400 | Length:2400 | Min. : 8.40 | Length:2400 | Length:2400 | CO : 261 |
| 1st Qu.: 94.0 | 1st Qu.:0.05000 | 1st Qu.: 21.00 | Class :character | Class :character | 1st Qu.:12.00 | Class :character | Class :character | CA : 183 |
| Median :206.5 | Median :0.05600 | Median : 35.00 | Mode :character | Mode :character | Median :12.00 | Mode :character | Mode :character | MI : 161 |
| Mean :232.6 | Mean :0.05969 | Mean : 42.59 | NA | NA | Mean :13.58 | NA | NA | IN : 139 |
| 3rd Qu.:367.0 | 3rd Qu.:0.06700 | 3rd Qu.: 64.00 | NA | NA | 3rd Qu.:16.00 | NA | NA | TX : 129 |
| Max. :558.0 | Max. :0.12800 | Max. :138.00 | NA | NA | Max. :32.00 | NA | NA | OR : 125 |
| NA | NA | NA’s :976 | NA | NA | NA | NA | NA | (Other):1402 |
## `summarise()` ungrouping output (override with `.groups` argument)
##  [1] Style               Brewery_id          ABV                
##  [4] Drink_name          Ounces              Brewery            
##  [7] City                State               median_IBU_by_style
## [10] IBU.clean          
## <0 rows> (or 0-length row.names)
| Style | Brewery_id | ABV | Drink_name | Ounces | Brewery | City | State | median_IBU_by_style | IBU.clean |
|---|---|---|---|---|---|---|---|---|---|
| Length:2348 | Min. : 1 | Min. :0.02700 | Length:2348 | Min. : 8.40 | Length:2348 | Length:2348 | CO : 258 | Min. : 8.00 | Min. : 3.57 |
| Class :character | 1st Qu.: 92 | 1st Qu.:0.05000 | Class :character | 1st Qu.:12.00 | Class :character | Class :character | CA : 181 | 1st Qu.:21.00 | 1st Qu.: 21.00 |
| Mode :character | Median :204 | Median :0.05600 | Mode :character | Median :12.00 | Mode :character | Mode :character | MI : 146 | Median :30.00 | Median : 32.00 |
| NA | Mean :231 | Mean :0.05967 | NA | Mean :13.56 | NA | NA | IN : 139 | Mean :40.03 | Mean : 40.46 |
| NA | 3rd Qu.:366 | 3rd Qu.:0.06700 | NA | 3rd Qu.:16.00 | NA | NA | TX : 129 | 3rd Qu.:69.00 | 3rd Qu.: 60.00 |
| NA | Max. :558 | Max. :0.12800 | NA | Max. :32.00 | NA | NA | OR : 115 | Max. :96.00 | Max. :138.00 |
| NA | NA | NA | NA | NA | NA | NA | (Other):1380 | NA | NA |
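A minimal sketch of median-by-style imputation that produces columns named like the report's `median_IBU_by_style` and `IBU.clean`. The toy data and the use of base R `ave()` are assumptions; the report's own pipeline used `summarise()` and is not shown.

```r
beers <- data.frame(
  Style = c("American IPA", "American IPA", "American IPA", "Gose", "Gose"),
  IBU   = c(60, 70, NA, 8, NA),
  stringsAsFactors = FALSE
)

# Median of the observed IBU values within each style.
beers$median_IBU_by_style <- ave(beers$IBU, beers$Style,
                                 FUN = function(x) median(x, na.rm = TRUE))

# Keep observed values; use the style median only where IBU is missing.
beers$IBU.clean <- ifelse(is.na(beers$IBU),
                          beers$median_IBU_by_style,
                          beers$IBU)
```

A style with no observed IBU at all has no median to borrow, so its rows stay missing under this approach and would have to be dropped or handled separately.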

3e. Plot missing data to show it has all been resolved.

## 
##  Variables sorted by number of missings: 
##             Variable Count
##                Style     0
##           Brewery_id     0
##                  ABV     0
##           Drink_name     0
##               Ounces     0
##              Brewery     0
##                 City     0
##                State     0
##  median_IBU_by_style     0
##            IBU.clean     0
##     Style             Brewery_id       ABV           Drink_name       
##  Length:2348        Min.   :  1   Min.   :0.02700   Length:2348       
##  Class :character   1st Qu.: 92   1st Qu.:0.05000   Class :character  
##  Mode  :character   Median :204   Median :0.05600   Mode  :character  
##                     Mean   :231   Mean   :0.05967                     
##                     3rd Qu.:366   3rd Qu.:0.06700                     
##                     Max.   :558   Max.   :0.12800                     
##                                                                       
##      Ounces        Brewery              City               State     
##  Min.   : 8.40   Length:2348        Length:2348        CO     : 258  
##  1st Qu.:12.00   Class :character   Class :character   CA     : 181  
##  Median :12.00   Mode  :character   Mode  :character   MI     : 146  
##  Mean   :13.56                                         IN     : 139  
##  3rd Qu.:16.00                                         TX     : 129  
##  Max.   :32.00                                         OR     : 115  
##                                                        (Other):1380  
##  median_IBU_by_style   IBU.clean     
##  Min.   : 8.00       Min.   :  3.57  
##  1st Qu.:21.00       1st Qu.: 21.00  
##  Median :30.00       Median : 32.00  
##  Mean   :40.03       Mean   : 40.46  
##  3rd Qu.:69.00       3rd Qu.: 60.00  
##  Max.   :96.00       Max.   :138.00  
## 

4. Median ABV and IBU per state (See output for values)

## `summarise()` ungrouping output (override with `.groups` argument)

## [1] "Total Unique Breweries: "
## [1] 558
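The per-state medians can be sketched with base R `aggregate()`; the toy data frame below is hypothetical and only borrows the report's column names.

```r
beers <- data.frame(
  State     = c("CO", "CO", "CO", "MN", "MN"),
  ABV       = c(0.05, 0.07, 0.06, 0.045, 0.049),
  IBU.clean = c(30, 60, 45, 50, 26),
  stringsAsFactors = FALSE
)

# One row per state with the median of each numeric column.
medians <- aggregate(cbind(ABV, IBU.clean) ~ State, data = beers, FUN = median)
medians
```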

5. Which state has the beer with the highest alcohol content (ABV)? Which state has the most bitter (IBU) beer? (See output for values)

| Style | Brewery_id | ABV | Drink_name | Ounces | Brewery | City | State | median_IBU_by_style | IBU.clean |
|---|---|---|---|---|---|---|---|---|---|
| Quadrupel (Quad) | 52 | 0.128 | Lee Hill Series Vol. 5 - Belgian Style Quadrupel Ale | 19.2 | Upslope Brewing Company | Boulder | CO | 24 | 24 |
| English Barleywine | 2 | 0.125 | London Balling | 16.0 | Against the Grain Brewery | Louisville | KY | 60 | 80 |
| Russian Imperial Stout | 18 | 0.120 | Csar | 16.0 | Tin Man Brewing Company | Evansville | IN | 94 | 90 |
| Rye Beer | 52 | 0.104 | Lee Hill Series Vol. 4 - Manhattan Style Rye Ale | 19.2 | Upslope Brewing Company | Boulder | CO | 57 | 57 |
| Baltic Porter | 47 | 0.100 | 4Beans | 12.0 | Sixpoint Craft Ales | Brooklyn | NY | 52 | 52 |
| American Barleywine | 310 | 0.099 | Old Devil’s Tooth | 12.0 | Sockeye Brewing Company | Boise | ID | 96 | 100 |
## [1] "highest ABV State"
##              Style Brewery_id   ABV
## 1 Quadrupel (Quad)         52 0.128
##                                             Drink_name Ounces
## 1 Lee Hill Series Vol. 5 - Belgian Style Quadrupel Ale   19.2
##                   Brewery    City State median_IBU_by_style IBU.clean
## 1 Upslope Brewing Company Boulder    CO                  24        24
| Style | Brewery_id | ABV | Drink_name | Ounces | Brewery | City | State | median_IBU_by_style | IBU.clean |
|---|---|---|---|---|---|---|---|---|---|
| American Double / Imperial IPA | 375 | 0.082 | Bitter Bitch Imperial IPA | 12 | Astoria Brewing Company | Astoria | OR | 90.5 | 138 |
| American IPA | 345 | 0.059 | Troopers Alley IPA | 12 | Wolf Hills Brewing Company | Abingdon | VA | 69.0 | 135 |
| American Double / Imperial IPA | 231 | 0.090 | Dead-Eye DIPA | 16 | Cape Ann Brewing Company | Gloucester | MA | 90.5 | 130 |
| American Double / Imperial IPA | 100 | 0.089 | Bay of Bengal Double IPA (2014) | 12 | Christian Moerlein Brewing Company | Cincinnati | OH | 90.5 | 126 |
| American Double / Imperial IPA | 273 | 0.080 | Heady Topper | 16 | The Alchemist | Waterbury | VT | 90.5 | 120 |
| American Double / Imperial IPA | 62 | 0.097 | Abrasive Ale | 16 | Surly Brewing Company | Brooklyn Center | MN | 90.5 | 120 |
## [1] "highest IBU State"
##                            Style Brewery_id   ABV                Drink_name
## 1 American Double / Imperial IPA        375 0.082 Bitter Bitch Imperial IPA
##   Ounces                 Brewery    City State median_IBU_by_style IBU.clean
## 1     12 Astoria Brewing Company Astoria    OR                90.5       138
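Picking the top row for each measure can be sketched with `which.max()`. The three rows below copy State, ABV, and IBU.clean from the tables above; the data frame name `beers` is a stand-in for the full imputed dataset.

```r
beers <- data.frame(
  State     = c("CO", "KY", "OR"),
  ABV       = c(0.128, 0.125, 0.082),
  IBU.clean = c(24, 80, 138),
  stringsAsFactors = FALSE
)

# Row with the single highest value of each measure.
top_abv <- beers[which.max(beers$ABV), ]
top_ibu <- beers[which.max(beers$IBU.clean), ]

top_abv$State  # state of the strongest beer
top_ibu$State  # state of the most bitter beer
```

`which.max()` returns only the first maximum, so ties (such as the two beers at 120 IBU further down the table) would need `beers$IBU.clean == max(beers$IBU.clean)` instead.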

6. Summary statistics and Histogram for ABV

| Style | Brewery_id | ABV | Drink_name | Ounces | Brewery | City | State | median_IBU_by_style | IBU.clean |
|---|---|---|---|---|---|---|---|---|---|
| Length:2348 | Min. : 1 | Min. :0.02700 | Length:2348 | Min. : 8.40 | Length:2348 | Length:2348 | CO : 258 | Min. : 8.00 | Min. : 3.57 |
| Class :character | 1st Qu.: 92 | 1st Qu.:0.05000 | Class :character | 1st Qu.:12.00 | Class :character | Class :character | CA : 181 | 1st Qu.:21.00 | 1st Qu.: 21.00 |
| Mode :character | Median :204 | Median :0.05600 | Mode :character | Median :12.00 | Mode :character | Mode :character | MI : 146 | Median :30.00 | Median : 32.00 |
| NA | Mean :231 | Mean :0.05967 | NA | Mean :13.56 | NA | NA | IN : 139 | Mean :40.03 | Mean : 40.46 |
| NA | 3rd Qu.:366 | 3rd Qu.:0.06700 | NA | 3rd Qu.:16.00 | NA | NA | TX : 129 | 3rd Qu.:69.00 | 3rd Qu.: 60.00 |
| NA | Max. :558 | Max. :0.12800 | NA | Max. :32.00 | NA | NA | OR : 115 | Max. :96.00 | Max. :138.00 |
| NA | NA | NA | NA | NA | NA | NA | (Other):1380 | NA | NA |

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

7a. Relationship between ABV and IBU (scatter plot with smoothed trend)

## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'

7b. EDA continued

TODO: Speak to Assumptions for Linear Regression, and P-value and confidence interval for ABV estimate, scope of inference

## Warning in sqrt(crit * p * (1 - hh)/hh): NaNs produced

## 
## Call:
## lm(formula = IBU.clean ~ State + State * ABV + ABV, data = bdat.imputed.IBU.clean)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -79.146 -12.212  -1.991  12.028  87.085 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -129.29      37.59  -3.440 0.000593 ***
## StateAL        64.69      49.16   1.316 0.188368    
## StateAR       160.47      72.28   2.220 0.026506 *  
## StateAZ       120.87      40.10   3.014 0.002606 ** 
## StateCA        98.93      38.08   2.598 0.009428 ** 
## StateCO       114.12      37.95   3.007 0.002668 ** 
## StateCT       106.06      40.27   2.634 0.008500 ** 
## StateDC        61.86      49.86   1.241 0.214906    
## StateDE       225.29     117.43   1.918 0.055178 .  
## StateFL        70.90      40.51   1.750 0.080241 .  
## StateGA        85.36      51.27   1.665 0.096033 .  
## StateHI        94.06      43.05   2.185 0.028982 *  
## StateIA       116.19      41.40   2.806 0.005056 ** 
## StateID        89.04      40.08   2.222 0.026413 *  
## StateIL        86.22      38.63   2.232 0.025741 *  
## StateIN       122.21      38.23   3.197 0.001410 ** 
## StateKS        96.74      41.84   2.312 0.020845 *  
## StateKY       127.68      40.24   3.173 0.001529 ** 
## StateLA        86.83      42.71   2.033 0.042172 *  
## StateMA        88.40      39.11   2.260 0.023891 *  
## StateMD       116.52      43.51   2.678 0.007457 ** 
## StateME        95.31      40.22   2.370 0.017892 *  
## StateMI       135.25      38.24   3.537 0.000413 ***
## StateMN        99.19      39.18   2.531 0.011425 *  
## StateMO        75.51      41.31   1.828 0.067731 .  
## StateMS        85.94      46.38   1.853 0.064002 .  
## StateMT        85.41      43.54   1.962 0.049926 *  
## StateNC       119.23      39.22   3.040 0.002394 ** 
## StateND        45.58      73.36   0.621 0.534452    
## StateNE       106.82      41.52   2.573 0.010155 *  
## StateNH       109.89      51.59   2.130 0.033284 *  
## StateNJ        98.05      42.32   2.317 0.020596 *  
## StateNM        31.91      50.27   0.635 0.525694    
## StateNV       127.15      44.33   2.868 0.004167 ** 
## StateNY       100.31      38.69   2.592 0.009592 ** 
## StateOH       108.08      39.83   2.714 0.006706 ** 
## StateOK        95.50      42.82   2.230 0.025829 *  
## StateOR        80.52      38.51   2.091 0.036651 *  
## StatePA       137.92      38.60   3.573 0.000361 ***
## StateRI       117.62      41.30   2.848 0.004443 ** 
## StateSC        96.29      42.01   2.292 0.022006 *  
## StateSD       140.02      66.71   2.099 0.035939 *  
## StateTN        68.21      80.31   0.849 0.395799    
## StateTX        84.98      38.36   2.215 0.026829 *  
## StateUT       128.78      39.57   3.254 0.001153 ** 
## StateVA        81.82      41.03   1.994 0.046274 *  
## StateVT        82.02      40.62   2.019 0.043611 *  
## StateWA       121.64      39.69   3.065 0.002206 ** 
## StateWI       102.55      39.61   2.589 0.009677 ** 
## StateWV        19.39     169.13   0.115 0.908754    
## StateWY        63.82      49.02   1.302 0.193128    
## ABV          3001.54     672.17   4.465 8.38e-06 ***
## StateAL:ABV -1183.04     838.98  -1.410 0.158652    
## StateAR:ABV -2985.79    1354.72  -2.204 0.027626 *  
## StateAZ:ABV -2303.02     709.68  -3.245 0.001191 ** 
## StateCA:ABV -1785.93     679.02  -2.630 0.008593 ** 
## StateCO:ABV -2077.20     677.07  -3.068 0.002181 ** 
## StateCT:ABV -1951.42     710.11  -2.748 0.006043 ** 
## StateDC:ABV -1327.38     831.19  -1.597 0.110414    
## StateDE:ABV -3801.54    1890.86  -2.010 0.044499 *  
## StateFL:ABV -1337.25     716.63  -1.866 0.062166 .  
## StateGA:ABV -1531.73     909.58  -1.684 0.092320 .  
## StateHI:ABV -1792.26     765.74  -2.341 0.019342 *  
## StateIA:ABV -2229.85     730.51  -3.052 0.002296 ** 
## StateID:ABV -1552.43     707.95  -2.193 0.028422 *  
## StateIL:ABV -1599.72     686.70  -2.330 0.019917 *  
## StateIN:ABV -2224.00     680.72  -3.267 0.001103 ** 
## StateKS:ABV -1790.34     744.49  -2.405 0.016262 *  
## StateKY:ABV -2357.52     704.93  -3.344 0.000838 ***
## StateLA:ABV -1595.73     761.10  -2.097 0.036140 *  
## StateMA:ABV -1609.45     698.31  -2.305 0.021271 *  
## StateMD:ABV -2108.40     764.70  -2.757 0.005878 ** 
## StateME:ABV -1704.05     713.61  -2.388 0.017027 *  
## StateMI:ABV -2527.87     681.02  -3.712 0.000211 ***
## StateMN:ABV -1666.19     695.38  -2.396 0.016653 *  
## StateMO:ABV -1360.36     739.84  -1.839 0.066088 .  
## StateMS:ABV -1477.17     809.50  -1.825 0.068165 .  
## StateMT:ABV -1570.35     775.43  -2.025 0.042971 *  
## StateNC:ABV -2191.86     697.28  -3.143 0.001691 ** 
## StateND:ABV  -704.55    1331.48  -0.529 0.596756    
## StateNE:ABV -1967.10     735.33  -2.675 0.007524 ** 
## StateNH:ABV -2023.81     943.99  -2.144 0.032149 *  
## StateNJ:ABV -1648.93     743.89  -2.217 0.026748 *  
## StateNM:ABV  -545.71     862.92  -0.632 0.527193    
## StateNV:ABV -2305.26     752.69  -3.063 0.002220 ** 
## StateNY:ABV -1778.79     689.84  -2.579 0.009985 ** 
## StateOH:ABV -1958.42     702.82  -2.787 0.005373 ** 
## StateOK:ABV -1852.86     752.00  -2.464 0.013817 *  
## StateOR:ABV -1320.52     687.60  -1.920 0.054924 .  
## StatePA:ABV -2483.76     687.23  -3.614 0.000308 ***
## StateRI:ABV -2246.67     733.43  -3.063 0.002215 ** 
## StateSC:ABV -1830.39     735.23  -2.490 0.012863 *  
## StateSD:ABV -2675.35    1140.95  -2.345 0.019122 *  
## StateTN:ABV -1175.32    1444.81  -0.813 0.416031    
## StateTX:ABV -1598.07     683.80  -2.337 0.019525 *  
## StateUT:ABV -2234.21     709.70  -3.148 0.001665 ** 
## StateVA:ABV -1384.88     729.61  -1.898 0.057811 .  
## StateVT:ABV -1401.76     714.83  -1.961 0.050007 .  
## StateWA:ABV -2113.67     706.96  -2.990 0.002822 ** 
## StateWI:ABV -2001.65     710.22  -2.818 0.004869 ** 
## StateWV:ABV  -301.54    2734.91  -0.110 0.912216    
## StateWY:ABV -1237.25     879.26  -1.407 0.159519    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 18.75 on 2246 degrees of freedom
## Multiple R-squared:  0.4258, Adjusted R-squared:    0.4 
## F-statistic: 16.49 on 101 and 2246 DF,  p-value: < 2.2e-16
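Two points about reading this model, sketched on simulated data (the data frame `toy` and its coefficients are made up). First, `State * ABV` already expands to `State + ABV + State:ABV`, so the formula in the call above is equivalent to the shorter `IBU.clean ~ State * ABV`. Second, with R's default treatment contrasts the `ABV` coefficient is the slope for the baseline state (the first factor level, which has no row of its own in the table), and each state's slope is that value plus its interaction term.

```r
set.seed(1)
toy <- data.frame(
  State = factor(rep(c("CO", "MN", "OR"), each = 20)),
  ABV   = runif(60, 0.04, 0.10)
)
toy$IBU <- 10 + 600 * toy$ABV + rnorm(60, sd = 5)

# The redundant and the compact formula fit the same model.
fit_long  <- lm(IBU ~ State + State * ABV + ABV, data = toy)
fit_short <- lm(IBU ~ State * ABV, data = toy)
all.equal(coef(fit_long), coef(fit_short))

# Slope for Colorado from the reported estimates: ABV + StateCO:ABV.
co_slope <- 3001.54 + (-2077.20)
co_slope
```

The large baseline slope with large negative offsets for most states suggests the baseline state's estimate is driving the table, which is worth keeping in mind when reading the individual p-values.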

Note: American Pale Ale is very similar to IPA, but we classify it as an "other" ale.

8a. Data prep for the difference in IBU and ABV between IPAs and other Ales

##                    Abbey Single Ale                             Altbier 
##                                   2                                  13 
##              American Adjunct Lager            American Amber / Red Ale 
##                                  18                                 132 
##          American Amber / Red Lager                 American Barleywine 
##                                  28                                   3 
##                  American Black Ale                 American Blonde Ale 
##                                  36                                 108 
##                  American Brown Ale             American Dark Wheat Ale 
##                                  70                                   7 
##      American Double / Imperial IPA  American Double / Imperial Pilsner 
##                                 105                                   2 
##    American Double / Imperial Stout           American India Pale Lager 
##                                   9                                   3 
##                        American IPA             American Pale Ale (APA) 
##                                 423                                 244 
##                 American Pale Lager             American Pale Wheat Ale 
##                                  38                                  96 
##                    American Pilsner                     American Porter 
##                                  25                                  67 
##                      American Stout                 American Strong Ale 
##                                  39                                  14 
##                  American White IPA                   American Wild Ale 
##                                  11                                   6 
##                       Baltic Porter                    Belgian Dark Ale 
##                                   6                                  11 
##                         Belgian IPA                    Belgian Pale Ale 
##                                  18                                  24 
##             Belgian Strong Dark Ale             Belgian Strong Pale Ale 
##                                   6                                   7 
##                  Berliner Weissbier                      Bière de Garde 
##                                  11                                   7 
##                                Bock      California Common / Steam Beer 
##                                   7                                   6 
##                          Chile Beer                           Cream Ale 
##                                   3                                  29 
##                      Czech Pilsener                          Doppelbock 
##                                  28                                   7 
##           Dortmunder / Export Lager                              Dubbel 
##                                   6                                   5 
##                        Dunkelweizen                  English Barleywine 
##                                   4                                   3 
##                      English Bitter                   English Brown Ale 
##                                   3                                  18 
##               English Dark Mild Ale        English India Pale Ale (IPA) 
##                                   6                                  13 
##                    English Pale Ale               English Pale Mild Ale 
##                                  12                                   3 
##                       English Stout                  English Strong Ale 
##                                   2                                   4 
##                     Euro Dark Lager                     Euro Pale Lager 
##                                   5                                   2 
## Extra Special / Strong Bitter (ESB)                  Flanders Oud Bruin 
##                                  20                                   1 
##              Foreign / Export Stout              Fruit / Vegetable Beer 
##                                   6                                  49 
##                     German Pilsener                                Gose 
##                                  36                                  10 
##                            Grisette                          Hefeweizen 
##                                   1                                  40 
##                Herbed / Spiced Beer                     Irish Dry Stout 
##                                   9                                   5 
##                       Irish Red Ale          Keller Bier / Zwickel Bier 
##                                  12                                   3 
##                              Kölsch                               Lager 
##                                  42                                   1 
##                         Light Lager               Maibock / Helles Bock 
##                                  12                                   5 
##                Märzen / Oktoberfest                  Milk / Sweet Stout 
##                                  30                                  10 
##                 Munich Dunkel Lager                 Munich Helles Lager 
##                                   4                                  20 
##                       Oatmeal Stout                             Old Ale 
##                                  18                                   2 
##                               Other                         Pumpkin Ale 
##                                   1                                  23 
##                    Quadrupel (Quad)                              Radler 
##                                   4                                   3 
##                          Roggenbier              Russian Imperial Stout 
##                                   2                                  11 
##                            Rye Beer              Saison / Farmhouse Ale 
##                                  18                                  52 
##                         Schwarzbier              Scotch Ale / Wee Heavy 
##                                   9                                  15 
##                        Scottish Ale            Scottish-Style Amber Ale 
##                                  19                                   1 
##                         Smoked Beer                              Tripel 
##                                   1                                  11 
##                        Vienna Lager                           Wheat Ale 
##                                  20                                   1 
##                       Winter Warmer                             Witbier 
##                                  15                                  51
##      Style Brewery_id   ABV                   Drink_name Ounces
## 1 OtherAle         58 0.049      Abbey's Single (2015- )     12
## 2 OtherAle         58 0.049 Abbey's Single Ale (Current)     12
## 3 OtherAle        361 0.061                  Hot Rod Red     12
## 4 OtherAle        553 0.056      Mickey Finn's Amber Ale     12
## 5 OtherAle        102 0.052          Hurricane Amber Ale     12
## 6 OtherAle         83 0.052    Fat Tire Amber Ale (2011)     12
##                           Brewery          City State median_IBU_by_style
## 1                 Destihl Brewery   Bloomington    IL                  22
## 2                 Destihl Brewery   Bloomington    IL                  22
## 3         Aviator Brewing Company Fuquay-Varina    NC                  30
## 4           Mickey Finn's Brewery  Libertyville    IL                  30
## 5 Coastal Extreme Brewing Company       Newport    RI                  30
## 6     New Belgium Brewing Company  Fort Collins    CO                  30
##   IBU.clean
## 1        22
## 2        22
## 3        41
## 4        30
## 5        24
## 6        18

##    Length     Class      Mode 
##      2348 character character
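A sketch of one way the styles could be collapsed into `IPA` and `OtherAle`. The matching rule (the substring "IPA" first, then the substring "Ale") is an assumption; the report does not show which patterns it used.

```r
styles <- c("American IPA", "American Pale Ale (APA)",
            "English India Pale Ale (IPA)", "American Porter", "Cream Ale")

# Check "IPA" first so that "English India Pale Ale (IPA)" is not
# swallowed by the more general "Ale" rule.
group <- ifelse(grepl("IPA", styles, fixed = TRUE), "IPA",
         ifelse(grepl("Ale", styles, fixed = TRUE), "OtherAle", NA))

table(group, useNA = "ifany")
```

Under this rule American Pale Ale (APA) lands in `OtherAle`, as the note above describes, and styles that are neither (porters, lagers, stouts) come out as `NA`.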

8b. Test split
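A minimal sketch of a random train/test split. The 85/15 proportion is an assumption: it is roughly consistent with the 1300 training rows implied by the regression output in 8d (1296 residual degrees of freedom plus 4 coefficients) and the 230 test rows in the confusion matrix in 8c, but the report does not state the proportion or the seed.

```r
set.seed(42)  # hypothetical seed, only for reproducibility of the sketch
n   <- 100
toy <- data.frame(id = seq_len(n), ABV = runif(n, 0.03, 0.10))

# Sample row indices for training; the remaining rows form the test set.
train_idx <- sample(seq_len(n), size = round(0.85 * n))
train <- toy[train_idx, ]
test  <- toy[-train_idx, ]
```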

8c. KNN for IPAs vs Other Ales

## Confusion Matrix and Statistics
## 
##           classifications
##            IPA OtherAle
##   IPA       67        7
##   OtherAle  16      140
##                                           
##                Accuracy : 0.9             
##                  95% CI : (0.8537, 0.9355)
##     No Information Rate : 0.6391          
##     P-Value [Acc > NIR] : < 2e-16         
##                                           
##                   Kappa : 0.778           
##                                           
##  Mcnemar's Test P-Value : 0.09529         
##                                           
##             Sensitivity : 0.8072          
##             Specificity : 0.9524          
##          Pos Pred Value : 0.9054          
##          Neg Pred Value : 0.8974          
##              Prevalence : 0.3609          
##          Detection Rate : 0.2913          
##    Detection Prevalence : 0.3217          
##       Balanced Accuracy : 0.8798          
##                                           
##        'Positive' Class : IPA             
## 
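The headline statistics can be reproduced by hand from the four cell counts. The sketch below follows caret's convention of treating rows as predictions and columns as the reference; the dimension names `predicted` and `reference` are labels added here.

```r
# Cell counts copied from the confusion matrix above (filled column-wise).
cm <- matrix(c(67, 16, 7, 140), nrow = 2,
             dimnames = list(predicted = c("IPA", "OtherAle"),
                             reference = c("IPA", "OtherAle")))

accuracy    <- sum(diag(cm)) / sum(cm)                       # 207 / 230
sensitivity <- cm["IPA", "IPA"] / sum(cm[, "IPA"])           # 67 / 83
specificity <- cm["OtherAle", "OtherAle"] / sum(cm[, "OtherAle"])  # 140 / 147
balanced    <- (sensitivity + specificity) / 2

round(c(accuracy, sensitivity, specificity, balanced), 4)
```

If the table was instead built with the true labels in rows and the KNN classifications in columns, as the column label hints, then sensitivity and positive predictive value trade places (and likewise specificity and negative predictive value), while accuracy is unchanged.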

8d. Linear Regression for IPAs vs Other Ales

##                         2.5 %      97.5 %
## (Intercept)          8.439664   22.633685
## StyleOtherAle      -23.536657   -5.964116
## ABV                707.658144  912.217775
## StyleOtherAle:ABV -373.372852 -101.203617
## 
## Call:
## lm(formula = IBU.clean ~ Style + Style * ABV + ABV, data = bdat.IPA.Vs.Ales.train)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -47.902  -9.137  -2.282   8.286  76.991 
## 
## Coefficients:
##                   Estimate Std. Error t value Pr(>|t|)    
## (Intercept)         15.537      3.618   4.295 1.88e-05 ***
## StyleOtherAle      -14.750      4.479  -3.293 0.001016 ** 
## ABV                809.938     52.136  15.535  < 2e-16 ***
## StyleOtherAle:ABV -237.288     69.367  -3.421 0.000644 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.42 on 1296 degrees of freedom
## Multiple R-squared:  0.6573, Adjusted R-squared:  0.6566 
## F-statistic: 828.8 on 3 and 1296 DF,  p-value: < 2.2e-16
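To read the interaction model, it helps to turn the four estimates into one line per group. IPA is the reference level, so its line uses the intercept and `ABV` terms alone; the OtherAle line adds the two `StyleOtherAle` offsets. The helper `predict_ibu` is a hypothetical name introduced for this sketch.

```r
# Point estimates copied from the coefficient table above.
b <- c(intercept = 15.537, other = -14.750, abv = 809.938, other_abv = -237.288)

# Fitted IBU for a given ABV; other_ale is 0 for IPA, 1 for OtherAle.
predict_ibu <- function(abv, other_ale) {
  b[["intercept"]] + b[["other"]] * other_ale +
    (b[["abv"]] + b[["other_abv"]] * other_ale) * abv
}

slope_ipa   <- b[["abv"]]                     # IBU per unit ABV, IPAs
slope_other <- b[["abv"]] + b[["other_abv"]]  # IBU per unit ABV, other ales

ipa_at_6   <- predict_ibu(0.06, other_ale = 0)
other_at_6 <- predict_ibu(0.06, other_ale = 1)
```

Because ABV is stored as a fraction, a one-percentage-point increase (0.01) corresponds to about 8.1 more IBU for IPAs and about 5.7 more for other ales; at 6% ABV the fitted values are roughly 64 and 35 IBU.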

8e. LDA and KNN to predict style based on IBU and ABV individually

9

## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: StateDE
## Warning in preProcess.default(thresh = 0.95, k = 5, freqCut = 19, uniqueCut =
## 10, : These variables have zero variances: StateWV

## Confusion Matrix and Statistics
## 
##     classifications
##       12  16
##   12 127  14
##   16  50  36
##                                           
##                Accuracy : 0.7181          
##                  95% CI : (0.6547, 0.7756)
##     No Information Rate : 0.7797          
##     P-Value [Acc > NIR] : 0.9883          
##                                           
##                   Kappa : 0.3477          
##                                           
##  Mcnemar's Test P-Value : 1.214e-05       
##                                           
##             Sensitivity : 0.7175          
##             Specificity : 0.7200          
##          Pos Pred Value : 0.9007          
##          Neg Pred Value : 0.4186          
##              Prevalence : 0.7797          
##          Detection Rate : 0.5595          
##    Detection Prevalence : 0.6211          
##       Balanced Accuracy : 0.7188          
##                                           
##        'Positive' Class : 12              
##
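For this matrix the key comparison is accuracy against the no-information rate (NIR), the accuracy obtained by always predicting the more common class. A short check using the cell counts above, with rows taken as predictions and columns as the reference per caret's convention:

```r
# Cell counts copied from the confusion matrix above (filled column-wise).
cm <- matrix(c(127, 50, 14, 36), nrow = 2,
             dimnames = list(predicted = c("12", "16"),
                             reference = c("12", "16")))

accuracy <- sum(diag(cm)) / sum(cm)     # 163 / 227
nir      <- max(colSums(cm)) / sum(cm)  # 177 / 227

accuracy < nir
```

Accuracy (about 0.718) falls below the NIR (about 0.780), which is why the reported `P-Value [Acc > NIR]` is 0.9883: on this test set the classifier does no better than always predicting class 12.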